Skip to content

Conversation

@jfly
Copy link
Collaborator

@jfly jfly commented May 13, 2025

This completes step 1 of
#573 (comment).

@jfly
Copy link
Collaborator Author

jfly commented Jun 24, 2025

@brianmcgee @zimbatm, is there anything I can do to help move this forward?

@brianmcgee
Copy link
Member

It's unclear to me if this documents the current behaviour or the desired behaviour. If it's the desired behaviour, isn't there some work needing to be done in treefmt to support that before we publish a spec?

@jfly jfly marked this pull request as draft June 28, 2025 15:00
@jfly
Copy link
Collaborator Author

jfly commented Jun 28, 2025

If it's the desired behaviour, isn't there some work needing to be done in treefmt to support that before we publish a spec?

Yes, treefmt needs changes to support this. I'm happy to implement that (in this PR or a separate one). I just don't want to put in the effort until I know we like this approach.

@brianmcgee
Copy link
Member

I'll try to put my brain on this in the next few days. Been a busy few weeks 😫

@jfly
Copy link
Collaborator Author

jfly commented Jul 21, 2025

@brianmcgee, friendly ping

Copy link
Member

@brianmcgee brianmcgee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with some of your own musings in the comments you'd left. Otherwise, it seems like a sane spec and one which should be easy enough for formatters to follow.

I also like that this is opt-in.

@brianmcgee
Copy link
Member

P.S. Thanks for taking the time to flesh this out, and sorry for taking a while to get around to reviewing this.

@jfly jfly force-pushed the issue-573-stdin-spec branch 4 times, most recently from 8e6e1cb to 5671029 Compare July 21, 2025 22:25
@jfly jfly force-pushed the issue-573-stdin-spec branch from 5671029 to d8276a6 Compare July 22, 2025 20:02
@jfly jfly requested a review from brianmcgee July 22, 2025 20:09
Copy link
Contributor

@infinisil infinisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@jfly
Copy link
Collaborator Author

jfly commented Jul 22, 2025

@brianmcgee @zimbatm, I could use another review on this. If we're happy with the state of this, then I'd like to work on the underlying implementation.

Copy link
Member

@zimbatm zimbatm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor things, but directionally it makes sense.

formatting "virtual files" passed via stdin.

A formatter **MUST** implement the Stdin Specification if its formatting behavior
can depend on the name of the file being formatted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit hard to read

Copy link
Collaborator Author

@jfly jfly Aug 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I don't know how to fix it. Does flipping the sentence around help? Suggestions appreciated.

If a formatter's behavior can depend on the name of the file being formatted, then it MUST implement the Stdin Specification.

"stdin mode". In stdin mode, the formatter reads file contents from stdin
rather than the filesystem.
- The formatter _MAY_ understand `--stdin-filepath=<path>` as well, but **MUST**
understand the space separated variant.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to outline the purpose of the <path> argument explicitly. Both you and I know why it's there, but maybe not the reader.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the path argument is not needed by the tool, it's also valid to ignore it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken a stab at this. Please take a look.

vanilla [Formatter Specification](/reference/formatter-spec), and additionally
satisfy the following:

### 1. `--stdin-filepath` flag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### 1. `--stdin-filepath` flag
### 1. `--stdin` flag

KISS?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm lightly opposed to this, for 2 reasons:

  1. I think --stdin-filepath is clearer that the option takes a value.
  2. treefmt itself currently implements a --stdin flag with different semantics, and I would like for treefmt to implement the new stdin spec. Changing this might be awkward?

WDYT?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfly FWIW, that sounds reasonable to me (I agree with and I would like for treefmt to implement the new stdin spec).

Comment on lines +7 to +8
Formatters **MAY** also implement the Stdin Specification, which allows
formatting "virtual files" passed via stdin.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this support expressed in the treefmt.toml config?

Copy link
Collaborator Author

@jfly jfly Aug 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we'd add a new formatter.<mylanguage>.stdin (or stdin_supported) bool that defaults to false if unspecified. Feedback appreciated.

Do you think the stdin spec is the place to put this? I was thinking this would go in our config docs.

@cstrahan
Copy link

Any remaining work to be done here, or is this just waiting for review?

I'd be happy to lend a hand, if there's any way I can help.

@jfly
Copy link
Collaborator Author

jfly commented Oct 27, 2025

@cstrahan I think this is ready for implementation. I'm pretty swamped right now. If you're up for putting together an implementation, I'd be happy to code review!

Comment on lines +31 to +39
- `--stdin-filepath <path>` is an optional flag that puts the formatter in
"stdin mode". In stdin mode, the formatter reads file contents from stdin
rather than the filesystem.
- The formatter _MAY_ alter its behavior based on the given `<path>`. For
example, if a there are different formatting rules in different
directories. If the formatter's behavior doesn't depend on the given
`<path>`, it's ok to ignore it.
- The formatter _MAY_ understand `--stdin-filepath=<path>` as well, but **MUST**
understand the space separated variant.
Copy link

@cormacrelf cormacrelf Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are tools out there whose cmdline parsers just do not support double dash flags with values. For example:

https://pkg.go.dev/flag#hdr-Command_line_flag_syntax

According to that documentation, any go-based formatter cannot support this without swapping away from the standard library's flags package. Treefmt itself is saved by using cobra.

Here are a few examples of stdin mode from a repo of mine. I count 5 6 7 unique ways of specifying this:

  • detects stdin, doesn't need a filepath: rustfmt, gofmt, alejandra --quiet
  • requires a - path: ruff format --silent -, mdformat -, tofu fmt -
  • buildifier -path=$path -
  • taplo format - --stdin-filepath=$path
  • prettier --stdin-filepath=$path
  • biome format --write --stdin-file-path $path
  • An old internal tool using --stdin-filename=...

Given this extensive variation, I believe the structure of the command line flags should not be specified AT ALL. You should just let the user describe what the tool accepts. The way jj fix configures this is quite good: it substitutes $path in command args like --stdin-filepath=$path (coincidentally every tool must run in stdin mode). Making treefmt work with everything it possibly can is more important than making a nice spec. I dread having to make PRs to 15-year-old barely-maintained build tools to make them support the spec, or writing endless formatting wrapper scripts. Basically please don't make me do that.

If the tool does support stdin mode, I don't see why you would ever want to run it in non-stdin mode. For example, rustfmt has both modes, but rustfmt file.rs does a filesystem walk from there for every mod statement it discovers. That's got to be bad for performance if treefmt is scheduling one rustfmt per file, since a file is formatted N times if it's N modules deep in a tree! (Edit: Unless you're running multiple files per formatter instance, in which case rustfmt does not duplicate work and will keep its code in cache better, but you still may not want this if some other formatter is single threaded.)

Therefore, I think you should just add a toml option saying "this tool runs in stdin mode", or perhaps just an alternate set of command line flags to use in stdin mode. Stdin mode is preferable in many cases, for example you can coordinate 5 formatter processes and pipe their stdin all the way through without buffering after every formatter. So many tools support it already with so many different flags that it's worth being flexible so this can be useful without waiting 2 years for a bunch of CLI flag adjustments to go through.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to that documentation, any go-based formatter cannot support this without swapping away from the standard library's flags package

Sorry, I don't follow this point. From the page you linked to, I see this:

The following forms are permitted:

--flag   // double dashes are also permitted

What am I missing?

Basically please don't make me do that. [...] it's worth being flexible so this can be useful without waiting 2 years for a bunch of CLI flag adjustments to go through.

I hear you. We will definitely need a compatibility story, but it's TBD if that lives in treefmt itself, or treefmt-nix.

If the tool does support stdin mode, I don't see why you would ever want to run it in non-stdin mode. [...] Edit: Unless you're running multiple files per formatter instance, in which case rustfmt does not duplicate work and will keep its code in cache better

This is exactly why: invoking the formatter N / B times (where B is our batch size) rather than N times.

you still may not want this if some other formatter is single threaded

This is a good question. We probably kind of assume that formatters are single threaded.

Copy link

@cormacrelf cormacrelf Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it seems the docs page for go flags is just terrible. It says "the following forms are permitted" and enumerates -flag and --flag as if to be exhaustive, but doesn't list --flag=value or --flag value. These do in fact work, as I just discovered. Notoriously posix getopt does not support that.

Compatibility story... TBD if that lives in treefmt itself, or treefmt-nix.

Please put it in treefmt. I don't want to use treefmt-nix. Sorry but I'm not learning yet another config schema, especially an opinionated one. And presuming you mean treefmt-nix would contain wrappers for all the well-known formatters to force them to fit the spec, one of the worst parts of Nix generally is that e.g. clang is buried about 40 layers deep under hacky bash wrappers each made of 6 layers of @@TEMPLATING_SYNTAX@@ and you never know what flags are actually being passed to the thing. Aside from making treefmt difficult to use for anything not covered by treefmt-nix, I don't know why you'd want to perpetuate that or take on that workload.

For everyday difficulty, try this example... let's say you make a typo over and over again. You have a code formatter to replace that specific typo. (This is legitimately very useful with jj fix for renaming a type or rewriting code across 80 commits at once, including code that was introduced and then replaced in a later commit.) You use sd to do it:

[formatter.cameron]
command = "sd"
options = ["Totanic", "Titanic"]
includes = ["*.md"]

This is instantly not compatible with the stdin spec. Basic things like this should be silently using tempfiles and not bothering me about spec compliance.

The batching etc

I think best solution is have a batch mode and a stdin mode. If no stdin mode is specified we use a temp file for stdin tasks. If no regular (batch mode) options are specified we use stdin when batch mode is called for. Bikeshed the options names but basically

[formatter.rust]
command = "rustfmt"
options = ["--edition", "2018"]
stdin_options = ["--edition", "2018", "--stdin-filepath=$path"]
includes = ["*.rs"]

Copy link
Collaborator Author

@jfly jfly Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You use sd to do it:

[formatter.cameron]
command = "sd"
options = ["Totanic", "Titanic"]
includes = ["*.md"]

This is instantly not compatible with the stdin spec.

I don't see the problem here. You are correct that this tool does not implement the stdin spec, but what's the issue? It's optional to implement. From the proposed spec:

Formatters MAY also implement the Stdin Specification [...] A formatter MUST implement the Stdin Specification if its formatting behavior can depend on the name of the file being formatted.

I think best solution is have a batch mode and a stdin mode

I like your options and stdin_options suggestion here!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my point is it's optional, but if you don't implement it, what does treefmt --stdin scene-01.md do? And if your answer is that treefmt uses a tempfile because it doesn't support the spec, how do we know that? And how does treefmt decide to use --stdin-filepath=file.rs for something that does support the spec?

Realistically if we follow this to its conclusion, we get to either "we always pass --stdin-filepath=... and if the tool does not support it, that's their fault", or "we sometimes pass it, depending on a config flag for each formatter". Either option has a dead zone of formatters that don't support the spec when running treefmt --stdin which can't be supported except with a wrapper script (which, notably, you can't write easily in bash, because of getopt not supporting double dash flags).

The stdin_options approach does not have a dead zone and only comes at the cost of repeating yourself slightly.

Copy link
Collaborator Author

@jfly jfly Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if your answer is that treefmt uses a tempfile because it doesn't support the spec, how do we know that?

I'm sorry, but I don't see the problem here. This is something treefmt already supports, and I'm not proposing changing that. It uses a tempfile under the hood because how else could it possibly work?

And how does treefmt decide to use --stdin-filepath=file.rs for something that does support the spec?

We could add a boolean for stdin_spec, but I hear you: at the point that we're implementing something like stdin_options, it's probably simpler to just be opinionated unopinionated.

Copy link

@cormacrelf cormacrelf Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It uses a tempfile under the hood because how else could it possibly work?

I asked only to prompt the necessity of some kind of config, instead of unconditionally assuming it supports the spec and adding --stdin-filepath=.... I think we understand each other now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants